CV: Convolution and image filtering
Convolution
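2D Convolution Kernel (origin at the center)
$$G(i,j) = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H(u,v)\, F(i-u,\, j-v)$$
Here $G(i,j)$ is the pixel value of the output image; $F(i-u, j-v)$ is the pixel value of the input image, i.e. the neighboring pixel at position $(i-u, j-v)$; $H(u,v)$ is the kernel weight; and $k$ is the kernel radius, so the kernel size is $(2k+1) \times (2k+1)$.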
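The following is a minimal sketch of a convolution with a centered kernel, assuming the common definition in which output pixel (i, j) sums kernel weight (u, v) times input pixel (i - u, j - v). The function name `convolve2d` and the sample values are illustrative assumptions, and only positions where the kernel fits entirely inside the image are computed.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with an odd-sized square kernel."""
    k = len(kernel) // 2  # kernel radius; the kernel size is (2k+1) x (2k+1)
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(k, rows - k):
        out_row = []
        for j in range(k, cols - k):
            acc = 0
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    # The kernel origin is at its center, hence the +k offsets.
                    acc += kernel[u + k][v + k] * image[i - u][j - v]
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 3x3 image and a kernel whose only nonzero weight is H(0, 1) = 1.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[0, 0, 0],
          [0, 0, 1],
          [0, 0, 0]]
print(convolve2d(image, kernel))  # [[4]]
```

The result is the pixel to the left of the center (4), not the one to the right (6), because convolution flips the kernel. Correlation would use `image[i + u][j + v]` instead.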
Convolution Result
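- Height: $H_{out} = \lfloor (H_{in} + 2P - K_h) / S \rfloor + 1$
- Width: $W_{out} = \lfloor (W_{in} + 2P - K_w) / S \rfloor + 1$
Here $H_{in} \times W_{in}$ is the input size, $K_h \times K_w$ is the kernel size, $P$ is the padding added on each side, and $S$ is the stride.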
Example
If the input data is an image of size
So, the output size is
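As a hedged illustration of the output-size calculation, the sketch below uses the standard formula floor((in + 2 * padding - kernel) / stride) + 1 along each dimension. The 32×32 input and 5×5 kernel are assumed values chosen for illustration, not the ones from the example above.

```python
def conv_output_size(in_size, kernel_size, padding=0, stride=1):
    """Output length along one dimension of a convolution."""
    return (in_size + 2 * padding - kernel_size) // stride + 1


# Assumed values: 32x32 input, 5x5 kernel, no padding, stride 1.
height = conv_output_size(32, 5)
width = conv_output_size(32, 5)
print(height, width)  # 28 28

# Same input with padding 2 and stride 2.
print(conv_output_size(32, 5, padding=2, stride=2))  # 16
```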
Padding
Preserve Spatial Dimensions (Same Padding)
- Without padding, the output size of a convolution layer decreases after every operation. Padding helps preserve the original input size by compensating for the loss of border pixels during convolution.
For example:
- Input size =
- Kernel size =
- Without padding: Output size =
- With “same” padding: Output size =
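The effect of "same" padding can be sketched by zero-padding a matrix and counting the positions where the kernel fits. The 6×6 input and 3×3 kernel are assumed values, and `zero_pad` and `valid_output_shape` are hypothetical helper names. For an odd kernel size K with stride 1, a padding of (K - 1) / 2 on each side restores the input size.

```python
def zero_pad(image, p):
    """Return a copy of the image with p rows/columns of zeros on every side."""
    cols = len(image[0])
    blank = [0] * (cols + 2 * p)
    padded = [blank[:] for _ in range(p)]
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)
    padded += [blank[:] for _ in range(p)]
    return padded


def valid_output_shape(image, kernel_size):
    """Number of positions where a square kernel fits entirely (stride 1)."""
    return (len(image) - kernel_size + 1, len(image[0]) - kernel_size + 1)


image = [[1] * 6 for _ in range(6)]  # assumed 6x6 input
kernel_size = 3                      # assumed 3x3 kernel
p = (kernel_size - 1) // 2           # "same" padding for an odd kernel

print(valid_output_shape(image, kernel_size))               # (4, 4)
print(valid_output_shape(zero_pad(image, p), kernel_size))  # (6, 6)
```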
Prevent Information Loss at Borders
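Without padding, border pixels take part in fewer kernel positions than pixels near the center: with a 3×3 kernel, a corner pixel contributes to only one output value, while an interior pixel contributes to nine. Padding lets the kernel be centered on border pixels as well, so more of the information at the edges is retained.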
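To make the border effect concrete, the sketch below counts how many kernel windows cover each pixel when no padding is used. The 5×5 image and 3×3 kernel are assumed sizes, and `coverage_counts` is a hypothetical helper name.

```python
def coverage_counts(size, kernel_size):
    """For a size x size image, count how many unpadded kernel windows touch each pixel."""
    counts = [[0] * size for _ in range(size)]
    for top in range(size - kernel_size + 1):
        for left in range(size - kernel_size + 1):
            for di in range(kernel_size):
                for dj in range(kernel_size):
                    counts[top + di][left + dj] += 1
    return counts


for row in coverage_counts(5, 3):
    print(row)
# [1, 2, 3, 2, 1]
# [2, 4, 6, 4, 2]
# [3, 6, 9, 6, 3]
# [2, 4, 6, 4, 2]
# [1, 2, 3, 2, 1]
```

Corner pixels are used once, while the center pixel is used nine times.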
Enable Deeper Networks
By maintaining consistent feature map sizes throughout the layers, padding allows for the design of deeper networks without rapidly shrinking the feature map size.
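A small sketch of how the feature map size evolves through a stack of layers, assuming stride 1 and the same square kernel in every layer. The 32-pixel input, 3×3 kernel, and depth of 5 are illustrative assumptions.

```python
def sizes_through_layers(in_size, kernel_size, padding, num_layers):
    """Feature map size after each of num_layers stride-1 convolutions."""
    sizes = [in_size]
    for _ in range(num_layers):
        sizes.append(sizes[-1] + 2 * padding - kernel_size + 1)
    return sizes


print(sizes_through_layers(32, 3, padding=0, num_layers=5))  # [32, 30, 28, 26, 24, 22]
print(sizes_through_layers(32, 3, padding=1, num_layers=5))  # [32, 32, 32, 32, 32, 32]
```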
Control Output Size
Padding can be adjusted to produce desired output dimensions for specific applications (e.g., “same” padding or “valid” padding).
Improve Symmetry for Feature Extraction
With padding, the kernel can be centered on every pixel, including those on the border, so regions near the edges are filtered in the same way as interior regions. This improves the extraction of features near edges.
Sampling in 2D
Downsampling
Assume we have a matrix A with size
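One simple way to downsample is to keep every second row and column. The sketch below assumes a hypothetical 4×4 matrix and a factor of 2. In practice the image is usually smoothed first (for example with a Gaussian filter) to reduce aliasing, a step omitted here.

```python
def downsample(matrix, factor=2):
    """Keep every factor-th row and column, starting from index 0."""
    return [row[::factor] for row in matrix[::factor]]


# Hypothetical 4x4 matrix.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(downsample(A))  # [[1, 3], [9, 11]]
```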
Upsampling
Assume we have a matrix
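Upsampling can be done in several ways. The sketch below assumes nearest-neighbor replication, where each value is repeated `factor` times along both axes; other schemes, such as zero insertion followed by interpolation, are equally plausible readings of these notes. The 2×2 matrix is a hypothetical example.

```python
def upsample_nearest(matrix, factor=2):
    """Repeat every value factor times horizontally and vertically."""
    out = []
    for row in matrix:
        wide = [value for value in row for _ in range(factor)]
        out.extend([wide[:] for _ in range(factor)])
    return out


# Hypothetical 2x2 matrix.
small = [[1, 2],
         [3, 4]]
for row in upsample_nearest(small):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```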
Laplacian in 2D
Assume we have an 8×8 matrix (image) B:
The Laplacian filter in 2D (for edge detection) is commonly represented as:
If we use the matrix to represent the Laplacian, the matrix is:
And the result is:
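As a hedged illustration on a hypothetical image (not the matrix B above), the sketch below applies one common discrete Laplacian, the 4-neighbor kernel with -4 at the center, to an 8×8 image containing a vertical step edge. Only positions where the kernel fits are computed, so the result is 6×6. The kernel is symmetric, so convolution and correlation give the same result here.

```python
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def filter3x3(image, kernel):
    """Apply a 3x3 kernel at every position where it fits (no padding)."""
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(1, rows - 1):
        out.append([
            sum(kernel[a][b] * image[i + a - 1][j + b - 1]
                for a in range(3) for b in range(3))
            for j in range(1, cols - 1)
        ])
    return out


# Hypothetical 8x8 image: left half 0, right half 10 (a vertical step edge).
step_image = [[0] * 4 + [10] * 4 for _ in range(8)]
result = filter3x3(step_image, LAPLACIAN)
print(result[0])  # [0, 0, 10, -10, 0, 0]
```

The response is zero in flat regions and changes sign across the edge; that zero crossing is what marks the edge location.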
Question
Part I: Convolution and image filtering
1. Comparing different filters?
2. Comparing different scales / sizes of a filter?
3. Separability property of a filter / convolution?
4. Convolution and correlation? How
5. How to construct a kernel approximating a 1st or 2nd derivative?
6. What is the Fourier Transform? What is it used for? How is it calculated in 1D? In 2D? Use padding if necessary.
7. Convolution in the image domain is equivalent to multiplication in the frequency domain. Why? How can this be verified?
8. How to obtain an image pyramid (Gaussian, Laplacian, steerable)? How to calculate them? What are their usages / applications?
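For the question on convolution versus multiplication in the frequency domain, a numerical check can be sketched in 1D with a naive DFT. The signals `x` and `h` are arbitrary assumed values. The identity holds exactly for circular convolution; to obtain linear convolution, both signals would first be zero-padded to at least len(x) + len(h) - 1 samples. The 2D case is analogous, with the transform applied along rows and then columns.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


x = [1, 2, 3, 4]   # assumed signal
h = [1, 0, -1, 0]  # assumed filter
direct = circular_convolve(x, h)
via_dft = [v.real for v in idft([a * b for a, b in zip(dft(x), dft(h))])]

print(direct)  # [-2, -2, 2, 2]
print(all(abs(a - b) < 1e-9 for a, b in zip(direct, via_dft)))  # True
```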